This is a submission for the assignment "Visualization in R using ggplot2" (link to assignment).
The content covers:
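library(DT)        # provides datatable()
rows <- nrow(df)   # number of observations
cols <- ncol(df)   # number of variables
datatable(df)      # interactive table of the marathon data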
## Warning in instance$preRenderHook(instance): It seems your data is too big
## for client-side DataTables. You may consider server-side processing: https://
## rstudio.github.io/DT/server.html
The marathon data set has 22 variables (columns) and 16,433 observations (rows).
Most of the participants are young or middle-aged, and the pattern is similar for both genders.
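A minimal sketch of how an age distribution per gender could be drawn with ggplot2. The column names `age` and `gender` and the toy data are assumptions made for illustration; the actual columns of the marathon data frame are not shown in this report.

```r
library(ggplot2)

# Toy data standing in for the marathon data frame
# (assumed columns: age, gender)
set.seed(1)
toy_age <- data.frame(
  age    = c(round(rnorm(200, mean = 38, sd = 10)),
             round(rnorm(100, mean = 36, sd = 9))),
  gender = rep(c("M", "F"), times = c(200, 100))
)

# One histogram panel per gender; free y scales keep the smaller group readable
p_age <- ggplot(toy_age, aes(x = age)) +
  geom_histogram(binwidth = 5, fill = "steelblue", colour = "white") +
  facet_wrap(~ gender, scales = "free_y") +
  labs(x = "Age (years)", y = "Number of participants")
p_age
```

Faceting rather than overlaying makes it easier to compare the shape of the two distributions when the group sizes differ.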
The diagram below shows that men finished the race faster than women, but the overall shape of the distribution is the same for both. This suggests that both genders perform comparably, even though the number of female participants (around 5,607) is much smaller than the number of male participants (around 10,826).
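A comparison of this kind could be sketched as follows. The column names `gender` and `chip_hours` are assumptions, and the toy values are invented for illustration only; they are not taken from the marathon data.

```r
library(ggplot2)

# Toy data; `gender` and `chip_hours` are assumed column names
set.seed(2)
toy_time <- data.frame(
  gender     = rep(c("F", "M"), times = c(60, 120)),
  chip_hours = c(rnorm(60, mean = 4.6, sd = 0.7),
                 rnorm(120, mean = 4.2, sd = 0.7))
)

# Number of participants per gender
gender_counts <- table(toy_time$gender)
print(gender_counts)

# Overlaid densities make it easy to compare shape and location
p_time <- ggplot(toy_time, aes(x = chip_hours, fill = gender)) +
  geom_density(alpha = 0.4) +
  labs(x = "Finish time (hours)", y = "Density")
p_time
```

A density plot is normalised within each group, so the difference in group size does not distort the comparison.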
Older women have slightly more disqualifications than men of the same age group.
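One way to check such a claim is to compare disqualification rates rather than raw counts, since the groups differ in size. The sketch below uses toy data, and the column names `age_group`, `gender` and `disqualified` are assumptions.

```r
# Toy data; `age_group`, `gender` and `disqualified` are assumed column names
toy_dq <- data.frame(
  age_group    = rep(c("20-39", "40-59", "60+"), each = 4),
  gender       = rep(c("F", "F", "M", "M"), times = 3),
  disqualified = c(FALSE, FALSE, FALSE, FALSE,
                   FALSE, TRUE,  FALSE, FALSE,
                   TRUE,  TRUE,  FALSE, TRUE)
)

# Share of disqualified participants in each age group and gender
# (the mean of a logical vector is the proportion of TRUE values)
dq_rate <- aggregate(disqualified ~ age_group + gender, data = toy_dq, FUN = mean)
print(dq_rate)
```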
The plot shows that gun time is a ceremonial rather than a reliable measure of finish time, as it deviates a lot from the actual chip time.
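Gun time starts at the starting signal, while chip time starts when a participant crosses the start line, so the gap between them can be computed per participant. A minimal sketch with toy data follows; `gun_min` and `chip_min` are assumed column names and the values are invented.

```r
library(ggplot2)

# Toy data; `gun_min` and `chip_min` (minutes) are assumed column names
toy_timing <- data.frame(
  chip_min = c(150, 200, 245, 290, 330),
  gun_min  = c(150.5, 203, 251, 299, 342)
)

# Gap between the two clocks for each participant
toy_timing$gap_min <- toy_timing$gun_min - toy_timing$chip_min
print(summary(toy_timing$gap_min))

# Points on the dashed line would mean both clocks agree
p_timing <- ggplot(toy_timing, aes(x = chip_min, y = gun_min)) +
  geom_point() +
  geom_abline(slope = 1, intercept = 0, linetype = "dashed") +
  labs(x = "Chip time (min)", y = "Gun time (min)")
p_timing
```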
Based on finishing time, participants are classified as runners if they finish in under 3 hours, joggers if they finish in between 3 and 5 hours, and walkers if they take more than 5 hours.
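The classification rule above can be sketched as a small helper. The function name `classify_finisher` is hypothetical, and treating exactly 3 and exactly 5 hours as "jogger" is an assumption, since the text does not say which class the boundaries belong to.

```r
# Hypothetical helper: classify participants by finishing time in hours.
# Assumption: the boundaries 3 h and 5 h both count as "jogger".
classify_finisher <- function(hours) {
  label <- ifelse(hours < 3, "runner",
                  ifelse(hours <= 5, "jogger", "walker"))
  factor(label, levels = c("runner", "jogger", "walker"))
}

# Usage on a few example finishing times (in hours)
finish_hours <- c(2.5, 3, 4.2, 5, 5.5)
finisher_type <- classify_finisher(finish_hours)
print(table(finisher_type))
```

Returning a factor with a fixed level order keeps the three groups in a sensible order on a plot axis or legend.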
It can be concluded that participants who are fast in the early stages are more likely to achieve a better overall position, which is to be expected.
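One way to quantify this relationship is a rank correlation between an early-stage position and the overall position. The sketch below uses toy data; `rank_early` and `overall_rank` are assumed column names, and the use of Spearman correlation is an illustrative choice rather than the method used in this report.

```r
# Toy data; `rank_early` and `overall_rank` are assumed column names
toy_rank <- data.frame(
  rank_early   = c(1, 2, 3, 4, 5, 6, 7, 8),
  overall_rank = c(2, 1, 3, 5, 4, 6, 8, 7)
)

# Spearman correlation compares the two orderings directly;
# values close to 1 mean early position predicts final position well
rho <- cor(toy_rank$rank_early, toy_rank$overall_rank, method = "spearman")
print(rho)
```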
## The figure below shows the positions of the top 10 finishers at different stages of the marathon
David (2nd position) actually performed better throughout the race, except for the last stage. He has a high chance of winning a future marathon, as his performance is consistent.
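Positions at each stage can be derived by ranking cumulative split times. The following is a minimal sketch under stated assumptions: the runner labels, the stage columns `km10`, `km21` and `km42`, and all times are invented for illustration and use four runners instead of ten.

```r
library(ggplot2)

# Toy cumulative split times in minutes; runner labels and stage columns
# are assumptions made for illustration only
splits <- data.frame(
  runner = c("A", "B", "C", "D"),
  km10   = c(30.0, 29.5, 30.5, 31.0),
  km21   = c(64.0, 63.0, 65.5, 65.0),
  km42   = c(129.0, 129.5, 133.0, 131.0)
)
stages <- c("km10", "km21", "km42")

# Position at each stage: rank runners by cumulative time (1 = fastest)
positions <- splits
positions[stages] <- lapply(splits[stages], rank, ties.method = "min")

# Long format: one row per runner and stage
long <- data.frame(
  runner   = rep(positions$runner, times = length(stages)),
  stage    = factor(rep(stages, each = nrow(positions)), levels = stages),
  position = unlist(positions[stages], use.names = FALSE)
)

# Reversed y axis so that position 1 appears at the top
p_pos <- ggplot(long, aes(x = stage, y = position,
                          group = runner, colour = runner)) +
  geom_line() +
  geom_point() +
  scale_y_reverse() +
  labs(x = "Stage", y = "Position")
p_pos
```

In this toy example runner B leads at the first two stages and drops to second at the finish, which is the kind of pattern described above.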